Automatic Identification of Phonetic Similarity Based on Underspecification

Authors

  • Mark Kane
  • Julie Mauclair
  • Julie Carson-Berndsen

Abstract

This paper presents a novel approach to identifying phonetic similarity using properties observed during the speech recognition process. In the experiments, specific phones are removed during the training phase of a statistical speech recognition system so that the system's behaviour can be analysed to see which alternative phone it selects. The analysis is restricted to specific contexts, and the alternatively recognised (or substituted) phones are analysed with respect to a number of factors, namely the common phonetic properties, the phonetic neighbourhood, and the frequency of occurrence in a particular corpus. The results indicate that a measure of phonetic similarity based on alternatively recognised observed properties can be predicted from a combination of these factors and, as such, can serve as an important additional source of information for modelling pronunciation variation.
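
The analysis described in the abstract can be illustrated with a minimal sketch. It assumes that the recogniser's output for a removed phone is already available as a plain list of substituted phone labels, and it uses a small, hypothetical feature inventory (`FEATURES`) that is not taken from the paper. The function names `shared_feature_ratio` and `substitution_profile` are likewise illustrative. The sketch covers only two of the factors mentioned (common phonetic properties and frequency of substitution); it is not the authors' method.

```python
from collections import Counter

# Hypothetical, simplified feature inventory for illustration only;
# the paper's actual feature set is not given in the abstract.
FEATURES = {
    "p": {"consonant", "plosive", "bilabial", "voiceless"},
    "b": {"consonant", "plosive", "bilabial", "voiced"},
    "t": {"consonant", "plosive", "alveolar", "voiceless"},
    "m": {"consonant", "nasal", "bilabial", "voiced"},
}


def shared_feature_ratio(a, b):
    """Jaccard overlap between the feature sets of phones a and b."""
    fa, fb = FEATURES[a], FEATURES[b]
    return len(fa & fb) / len(fa | fb)


def substitution_profile(removed, recognised):
    """For a phone removed from training, map each substituted phone to
    (relative substitution frequency, feature overlap with the removed phone).
    Entries are ordered from most to least frequent substitute."""
    counts = Counter(recognised)
    total = sum(counts.values())
    return {
        phone: (n / total, shared_feature_ratio(removed, phone))
        for phone, n in counts.most_common()
    }


if __name__ == "__main__":
    # Assumed toy data: labels recognised where /p/ was expected.
    for phone, (freq, overlap) in substitution_profile("p", ["b", "b", "t", "m"]).items():
        print(f"{phone}: frequency={freq:.2f}, feature overlap={overlap:.2f}")
```

Comparing the two numbers per substitute gives a rough view of whether frequently chosen substitutes are also those that share the most phonetic properties with the removed phone.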

Similar resources

Presentation of an efficient automatic short answer grading model based on combination of pseudo relevance feedback and semantic relatedness measures

Automatic short answer grading (ASAG) is the automated process of assessing answers based on natural language using computation methods and machine learning algorithms. Development of large-scale smart education systems on one hand and the importance of assessment as a key factor in the learning process and its confronted challenges, on the other hand, have significantly increased the need for ...

String Similarity Measures and PAM-like Matrices for Cognate Identification

We present a new automatic learning system for the identification of cognates, words that derive from a common ancestor and share the same etymological origin. Our approach combines and adapts several techniques developed for biological sequence analysis to the natural language processing environment. We design a linguistic-inspired matrix to align sensibly our training dataset. We introduce a ...

Against Underspecification in Speech Errors

This paper argues against the use of phonological underspecification in feature matrices on the basis of speech error data. Stemberger 1991 argues that phonological underspecification influences the similarity of phonemes. He claims underspecified features do not count toward similarity, based on an analysis of phoneme confusions in a naturally occurring speech error corpus. Using the same corp...

Automatic identification of confusable drug names

OBJECTIVE Many hundreds of drugs have names that either look or sound so much alike that doctors, nurses and pharmacists can get them confused, dispensing the wrong one in errors that can injure or even kill patients. METHODS AND MATERIAL We propose to address the problem through the application of two new methods-one based on orthographic similarity ("look-alike"), and the other based on pho...

Publication date: 2009